-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce EntityRetrievingClosestReferencedEntityIdLookup #195
Introduce EntityRetrievingClosestReferencedEntityIdLookup #195
Conversation
4200f7c
to
a0537ac
Compare
* @param int $expectedNumberOfGetEntityCalls | ||
* @return EntityLookup | ||
*/ | ||
private function maskEntityLookup( EntityLookup $entityLookup, $expectedNumberOfGetEntityCalls ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is mask
a typo? If not, I am confused by this verb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not, but I see that it's not accurate… so changed to restrictEntityLookup
.
|
||
/** | ||
* @param EntityLookup $entityLookup | ||
* @param int $expectedNumberOfGetEntityCalls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have this argument an an if-else? Looks like you can just have two simple methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, can even be one simpler method… changed to just use phpunit mocks all the time and not sometimes rely on RestrictedEntityLookup
.
* @param int $expectedPrefetches | ||
* @return EntityPrefetcher | ||
*/ | ||
private function getEntityPrefetcher( $expectedPrefetches ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the fundraising codebase we use the convention to name such methods in tests that construct new stuff newStuff
and not getStuff
. I think that is clearer, since then you know you get a new instance every time and can happily cause changes in its state without other stuff blowing up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also a fan of the new…
naming convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general, not tested manually yet (but I guess that would have to wait until a Lua binding is available in some other change?)
return $result; | ||
} | ||
} | ||
// Remove already visitted entities |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: visited
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done :)
} | ||
|
||
/** | ||
* @return InMemoryEntityLookup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anything else depend on this specifically returning an InMemoryEntityLookup and not any EntityLookup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, changed the return type hint.
* @param int $expectedPrefetches | ||
* @return EntityPrefetcher | ||
*/ | ||
private function getEntityPrefetcher( $expectedPrefetches ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also a fan of the new…
naming convention.
private $entityPrefetcher; | ||
|
||
/** | ||
* @var int Maximum search depth: Maximum number of intermediate entities to search through. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is confusing. A "maximum depth" is not the same as a "maximum number of entities to search through". This search spans a tree. If the tree is very flat, the search depth might be something like 5, while a few thousand entities have been traversed already. Please count the entities, and not the depth. We run into issues with this in the QualityConstrains codebase already. @lucaswerkmeister might be able to provide an example.
Further down in the constructor I see both a depth and some maximum number. It's hard to tell what they do. I think a depth is not needed.
If you think having both is helpful, I would not make the depth throw an exception. An exception is only thrown if the maximum number of entities is reached. My reasoning for this idea is: Think of a tree with two branches, the first is extremely deep, while the second is only a few entities deep. Both together exceed the "maximum number of entities" setting. If the depth-check throws an exception, nothing would be found. But if the depth-check only makes the code stop traversing a particular branch, the second, smaller branch would be found.
I hope this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I correctly understood your comment, if not, please correct me!
Please note that this is not doing a depth-first search, but a breadth-first search, thus there's no need to make "the code stop traversing a particular branch". In this context I think it might be useful for a caller to say that they only want us to look for paths with a certain maximum length.
The entity visit limitation is just an added safety, which I don't imagine firing very often, but only in specific cases where one or more visited entities references many other entities.
I agree that the documentation for this might be improvable. I'll see what I can do there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid I still don't know what this integer represents. The documentation uses the words "depth" (as in how deep a tree is) and "number" (as in the number of visited leaves) in the same sentence. For me these are two very different concepts. Which one does this integer represent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that this represents the maximum depth of the tree. You can also think of the depth as the number of entities above the current leaf (i. e., intermediate entities) at any point. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also think of the depth as the number of entities above the current leaf […]
"Above the current leave" is an other tree. What "number of entities" from this tree is meant? A horizontal or a vertical slice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maximum number of intermediate entities to search through
, this means given we have a path Q1 -> … -> Q42
, this is the number of entities represented by the …
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So "maxDepth" is indeed the maximum depth of that particular tree. :-) Thanks!
return null; | ||
} | ||
|
||
$this->alreadyVisited = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you keep this cache for the lifetime of the object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because we want to maybe visit the same entities again, if another (or even the same) reference-relationship is queried.
We may have a caching decorator for this in the future, but that's not this classes business.
* @return StatementListProvider|null Null if not applicable. | ||
*/ | ||
private function getEntity( EntityId $id, EntityId $fromId, PropertyId $propertyId, array $toIds ) { | ||
$this->alreadyVisited[$id->getSerialization()] = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I miss a safety check here that starts screaming when this is called with an already visited entity. As far as I can see this is not required, as the caller makes sure this can't happen. But this is a little hidden in an array_diff, and I would feel better with an additional isset check. It's cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a LogicException
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to actually trigger a warning now… throwing an exception is maybe a bit harsh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trigger_error
causes a global side effect. Without knowing the context here I can't say what is appropriate, though normally you either throw an exception and let the caller deal with it, which might be ignoring the error, escalating it, logging it, etc. Or, in cases where the execution in the local context should not be aborted, you log something or otherwise output info via some collaborator (which could also be an event thing or whatever).
a0537ac
to
1d0f5bd
Compare
One thing I just remembered – what’s the status of usage tracking in this service? We discussed this in #193, but I don’t see anything related in this pull request. |
I discussed this with @lydiapintscher and we probably don't need usage tracking for now. If we want usage tracking later on, we would need this service to track the specific patch chosen… or in this case, where the service even returns the shortest path, we would need to track all visited entities, if we wanted to keep this guarantee all the time. |
Part of https://phabricator.wikimedia.org/T179155